Golang : Check if user agent is a robot or crawler example
Problem:
You need to determine whether the user agent visiting your web server is a bot, robot, or crawler. You have tried a hash map lookup, but it breaks easily whenever the robot's version string changes. How do you write a generic function that detects whether a user agent is a robot?
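To make the problem concrete, here is a minimal sketch of why an exact-match lookup is fragile. The function names `exactMatch` and `substringMatch` and both user agent strings (including the `2.0` version number) are made up for illustration; they are not taken from a real crawler log.

```go
package main

import (
	"fmt"
	"strings"
)

// exactMatch looks the whole user agent string up in a map.
// The single map entry is illustrative only.
func exactMatch(ua string) bool {
	known := map[string]bool{
		"Mozilla/5.0 (compatible; Googlebot/2.0; +http://www.google.com/bot.html)": true,
	}
	return known[ua]
}

// substringMatch only looks for the bot's name anywhere in the string.
func substringMatch(ua string) bool {
	return strings.Contains(ua, "Googlebot")
}

func main() {
	oldUA := "Mozilla/5.0 (compatible; Googlebot/2.0; +http://www.google.com/bot.html)"
	newUA := "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

	fmt.Println(exactMatch(oldUA), exactMatch(newUA))         // true false
	fmt.Println(substringMatch(oldUA), substringMatch(newUA)) // true true
}
```

A one-character version bump is enough to defeat the map lookup, while the substring check keeps working.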
Solution:
I ported this solution from CodeIgniter for my own use. Feel free to adapt it for yours.
Here you go!
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

// isRobot reports whether the user agent string contains the name of a known robot.
func isRobot(useragent string) bool {
	// There are hundreds of bots; these are among the most common.
	// More bots are listed at
	// http://www.botsvsbrowsers.com/category/1/index.html
	// The list below is taken from
	// https://github.com/bcit-ci/CodeIgniter/blob/develop/system/libraries/User_agent.php
	// A hash map lookup needs an exact match on the whole user agent string and breaks
	// as soon as the version number changes, so scan a slice for substrings instead.
	robots := []string{"Googlebot", "Google Page Speed Insights", "MSNBot", "Baiduspider", "Bing", "DuckDuckBot", "Inktomi Slurp", "Yahoo", "Ask Jeeves", "FastCrawler", "YandexBot", "MediaPartners Google", "Crazy Webcrawler", "AdsBot Google", "Feedfetcher Google", "Curious George", "facebookexternalhit"}
	for _, bot := range robots {
		if strings.Contains(useragent, bot) {
			return true
		}
	}
	return false
}

// checkIfUserAgentIsRobot logs the visitor's user agent and tells the visitor
// whether it looks like a robot.
func checkIfUserAgentIsRobot(w http.ResponseWriter, r *http.Request) {
	ua := r.Header.Get("User-Agent")
	robot := isRobot(ua) // check once, reuse for both the log and the response
	fmt.Printf("user agent is: %s \n", ua)
	w.Write([]byte("user agent is " + ua + "\n"))
	result := "no"
	if robot {
		result = "yes"
	}
	fmt.Printf("user agent is a robot: %v \n", robot)
	w.Write([]byte("user agent is a robot: " + result + "\n"))
}

func main() {
	http.HandleFunc("/", checkIfUserAgentIsRobot)
	// ListenAndServe only returns on failure, e.g. when port 8080 is already in use.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
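One caveat: the substring check in the listing above is case-sensitive, and some entries in the list are display names rather than the tokens crawlers send. For example, the list has "AdsBot Google" with a space, while Google's ads crawler, as far as I know, identifies itself as "AdsBot-Google" with a hyphen. A possible variant, sketched below, lowercases the user agent and matches against lowercase tokens. The name `isRobotFold` and the short `botTokens` list are hypothetical; extend the list with the tokens you actually see in your logs.

```go
package main

import (
	"fmt"
	"strings"
)

// botTokens is a hypothetical, deliberately short list of lowercase tokens.
var botTokens = []string{"googlebot", "bingbot", "adsbot-google", "facebookexternalhit"}

// isRobotFold reports whether ua contains any token, ignoring letter case.
func isRobotFold(ua string) bool {
	ua = strings.ToLower(ua)
	for _, token := range botTokens {
		if strings.Contains(ua, token) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRobotFold("Mozilla/5.0 (compatible; GoogleBot/2.1)"))                    // true
	fmt.Println(isRobotFold("AdsBot-Google (+http://www.google.com/adsbot.html)"))         // true
	fmt.Println(isRobotFold("Mozilla/5.0 (Macintosh) Chrome/48.0.2564.116 Safari/537.36")) // false
}
```

Lowercasing costs one extra allocation per request, which is usually negligible next to the cost of serving the page.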
Output:
Browse page with Chrome browser
user agent is: Mozilla/5.0 (Macintosh; Intel Mac OS X 1085) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
user agent is a robot: false
user agent is: Mozilla/5.0 (Macintosh; Intel Mac OS X 1085) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
user agent is a robot: false
Browse page with Google Page Speed Insights bot
user agent is: Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Version/8.0 Mobile/12F70 Safari/600.1.4
user agent is a robot: true
user agent is: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/27.0.1453 Safari/537.36
user agent is a robot: true
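If you would rather not wait for a real crawler to visit, the handler can be exercised with a fake request. The sketch below uses the standard net/http/httptest package. Note that `robotHandler` and `probe` are hypothetical names, and `robotHandler` is a simplified stand-in that checks a single token only, not the full handler from this tutorial.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// robotHandler is a simplified stand-in for the tutorial's handler.
// It checks for one token only, to keep the sketch short.
func robotHandler(w http.ResponseWriter, r *http.Request) {
	ua := r.Header.Get("User-Agent")
	isBot := strings.Contains(ua, "Google Page Speed Insights")
	fmt.Fprintf(w, "user agent is a robot: %v\n", isBot)
}

// probe sends one in-memory request with the given User-Agent
// and returns the response body.
func probe(ua string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("User-Agent", ua)
	rec := httptest.NewRecorder()
	robotHandler(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Print(probe("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/27.0.1453 Safari/537.36"))
	fmt.Print(probe("Mozilla/5.0 (Macintosh) Chrome/48.0.2564.116 Safari/537.36"))
}
```

No port is opened here; the recorder captures what the handler writes, so the same idea works inside a regular Go test.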
References:
https://www.socketloop.com/tutorials/golang-check-if-item-is-in-slice-array
See also : Golang : How to determine if request or crawl is from Google robots
By Adam Ng